
    The intrinsic dimension of biological data landscapes

    Analyzing large volumes of high-dimensional data is an issue of fundamental importance in science and beyond. Several approaches work on the assumption that the important content of a dataset belongs to a manifold whose Intrinsic Dimension (ID) is much lower than the raw number of coordinates. That manifold, however, is generally twisted and curved; in addition, points on it will be non-uniformly distributed: two factors that make the identification of the ID and its exploitation really hard. Here we propose a new ID estimator using only the distance of the first and the second nearest neighbor of each point in the sample. This extreme minimality enables us to reduce the effects of curvature, of density variation, and the resulting computational cost. The ID estimator is theoretically exact in uniformly distributed data sets, and provides consistent measures in general. When used in combination with block analysis, it allows discriminating the relevant dimensions as a function of the block size. This allows estimating the ID even when the data lie on a manifold perturbed by a high-dimensional noise, a situation often encountered in real world data sets. Upon defining a notion of distance between protein sequences, this tool is used to estimate the ID of protein families, and to assess the consistency of generative models. Moreover, if coupled with a density estimator, our ID allows measuring the density of points by taking into account the space in which they actually lie, thus allowing for a cleaner estimation. Here we move a step further towards an automatic classification of protein sequences by using three new tools: our ID estimator, a density estimator and a clustering algorithm. We present the analysis performed on a Pfam PUA clan, showing that these combined tools successfully separate protein domains into architectures. Finally, we present a generalized model for the estimation of the ID that is able to work in data sets with multiple dimensionalities: taking advantage of Bayesian inference techniques, the method allows discriminating manifolds with different dimensions as well as assigning all the points to the respective manifolds. We test the method on a molecular dynamics trajectory, showing that the folded state has a higher dimension than the unfolded one.
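
    The two-neighbor idea in the abstract above can be illustrated with a short sketch. It assumes locally uniform density, under which the ratio of second to first nearest-neighbor distances follows a Pareto law whose exponent is the ID, and it uses a maximum-likelihood estimate of that exponent, which may differ from the fitting procedure the authors actually use. The function name `two_nn_id` and the brute-force distance computation are illustrative choices, not taken from the source.

```python
import numpy as np


def two_nn_id(points):
    """Estimate intrinsic dimension from first and second neighbor distances.

    Assumption (not from the source): a maximum-likelihood fit of the
    Pareto exponent of mu = r2 / r1 is used as the ID estimate.
    """
    x = np.asarray(points, dtype=float)

    # Squared pairwise distances via the Gram-matrix identity.
    sq_norms = np.sum(x * x, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)       # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)      # a point is not its own neighbor

    # The two smallest squared distances in each row.
    nearest = np.partition(d2, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])

    # Drop exact duplicates, for which the ratio is undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]

    # Pareto maximum likelihood: d = N / sum(log mu).
    return mu.size / np.sum(np.log(mu))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A flat 2-D sheet embedded in 5 coordinates (three are constant).
    sheet = np.hstack([rng.uniform(size=(1500, 2)), np.zeros((1500, 3))])
    print("estimated ID:", two_nn_id(sheet))
```

    For larger samples the dense distance matrix would normally be replaced by a tree-based neighbor search; the brute-force version is kept here only for brevity.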

    The intrinsic dimension of protein sequence evolution

    It is well known that, in order to preserve its structure and function, a protein cannot change its sequence at random, but only by mutations occurring preferentially at specific locations. We here investigate quantitatively the amount of variability that is allowed in protein sequence evolution, by computing the intrinsic dimension (ID) of the sequences belonging to a selection of protein families. The ID is a measure of the number of independent directions that evolution can take starting from a given sequence. We find that the ID is practically constant for sequences belonging to the same family, and moreover it is very similar in different families, with values ranging between 6 and 12. These values are significantly smaller than the raw number of amino acids, confirming the importance of correlations between mutations in different sites. However, we demonstrate that correlations are not sufficient to explain the small value of the ID we observe in protein families. Indeed, we show that the ID of a set of protein sequences generated by maximum entropy models, an approach in which correlations are accounted for, is typically significantly larger than the value observed in natural protein families. We further prove that a critical factor to reproduce the natural ID is to take into consideration the phylogeny of sequences.

    STAT3 mutation impacts biological and clinical features of T-LGL leukemia

    STAT3 mutations have been described in 30-40% of T-large granular lymphocyte (T-LGL) leukemia patients, leading to STAT3 pathway activation. Considering the heterogeneity of the disease and the several immunophenotypes that the LGL clone may express, the aim of this work was to evaluate whether STAT3 mutations might be associated with a distinctive LGL immunophenotype and/or might be indicative of specific clinical features. Our series of cases included a pilot cohort of 101 T-LGL leukemia patients (68 CD8+/CD4- and 33 CD4+/CD8±) from Padua Hematology Unit (Italy) and a validation cohort of 20 additional patients from Rennes Hematology Unit (France). Our results indicate that i) CD8+ T-LGL leukemia patients with CD16+/CD56- immunophenotype identify a subset of patients characterized by the presence of STAT3 mutations and neutropenia, ii) CD4+/CD8± T-LGL leukemias are devoid of STAT3 mutations but characterized by STAT5b mutations, and iii) a correlation exists between STAT3 activation and presence of Fas ligand, which is highly expressed in CD8+/CD16+/CD56- patients. Experiments with stimulation and inhibition of STAT3 phosphorylation confirmed this relationship. In conclusion, our data show that T-LGL leukemia with specific molecular and phenotypic patterns is associated with discrete clinical features, providing insight into the molecular bases of Fas ligand-mediated neutropenia.

    In Chronic Lymphocytic Leukemia the JAK2/STAT3 Pathway Is Constitutively Activated and Its Inhibition Leads to CLL Cell Death Unaffected by the Protective Bone Marrow Microenvironment

    The bone marrow microenvironment promotes proliferation and drug resistance in chronic lymphocytic leukemia (CLL). Although ibrutinib is active in CLL, it is rarely able to clear leukemic cells protected by bone marrow mesenchymal stromal cells (BMSCs) within the marrow niche. We investigated the modulation of the JAK2/STAT3 pathway in CLL by BMSCs and its targeting with AG490 (JAK2 inhibitor) or Stattic (STAT3 inhibitor). B cells collected from controls and CLL patients were treated with medium alone, ibrutinib, JAK/Signal Transducer and Activator of Transcription (STAT) inhibitors, or both drugs, in the presence or absence of BMSCs. The JAK2/STAT3 axis was evaluated by western blotting, flow cytometry, and confocal microscopy. We demonstrated that STAT3 was phosphorylated at Tyr705 in the majority of CLL patients at basal condition, and that phosphorylation increased following co-culture with BMSCs or IL-6. Treatment with AG490, but not Stattic, caused STAT3 and Lyn dephosphorylation, through re-activation of SHP-1, and triggered CLL apoptosis even when leukemic cells were cultured on BMSC layers. Moreover, while BMSCs hamper ibrutinib activity, the combination of ibrutinib and JAK/STAT inhibitors increases ibrutinib-mediated leukemic cell death, bypassing the pro-survival stimuli derived from BMSCs. We herein provide evidence that JAK2/STAT3 signaling might play a key role in the regulation of CLL-BMSC interactions and that its inhibition enhances ibrutinib activity, counteracting the bone marrow niche.

    Targeted activation of the SHP-1/PP2A signaling axis elicits apoptosis of chronic lymphocytic leukemia cells

    Lyn, a member of the Src family of kinases, is a key factor in the dysregulation of survival and apoptotic pathways of malignant B cells in chronic lymphocytic leukemia. One of the effects of Lyn's action is spatial and functional segregation of the tyrosine phosphatase SHP-1 into two pools, one beneath the plasma membrane in an active state promoting pro-survival signals, the other in the cytosol in an inhibited conformation and unable to counter the elevated level of cytosolic tyrosine phosphorylation. We herein show that SHP-1 activity can be elicited directly by nintedanib, an agent also known as a triple angiokinase inhibitor, circumventing the phospho-S591-dependent inhibition of the phosphatase, leading to the dephosphorylation of pro-apoptotic players such as procaspase-8 and serine/threonine phosphatase 2A, eventually triggering apoptosis. Furthermore, the activation of PP2A by using MP07-66, a novel FTY720 analog, stimulated SHP-1 activity via dephosphorylation of phospho-S591, which unveiled the existence of a positive feedback signaling loop involving the two phosphatases. In addition to providing further insights into the molecular basis of this disease, our findings indicate that the PP2A/SHP-1 axis may emerge as an attractive, novel target for the development of alternative strategies in the treatment of chronic lymphocytic leukemia.

    Lyn sustains oncogenic signaling in chronic lymphocytic leukemia by strengthening SET-mediated inhibition of PP2A.

    Aberrant protein kinase activities, and the consequent dramatic increase of Ser/Thr and Tyr phosphorylation, promote the deregulation of the survival pathways in chronic lymphocytic leukemia (CLL), which is crucial to the pathogenesis and progression of the disease. In this study, we show that the tumor suppressor Protein Phosphatase 2A (PP2A), one of the major Ser/Thr phosphatases, is in an inhibited form due to the synergistic contribution of two events, the interaction with its physiological inhibitor SET and the phosphorylation of Y307 of the catalytic subunit of PP2A. The latter event is mediated by Lyn, a Src family kinase previously found to be overexpressed, delocalized and constitutively active in CLL cells. This Lyn/PP2A axis accounts for the persistent high level of phosphorylation of the phosphatase's targets and represents a key connection linking phosphotyrosine- and phosphoserine/threonine-mediated oncogenic signals. The data herein presented show that the disruption of the SET/PP2A complex by a novel FTY720-analogue (MP07-66) devoid of immunosuppressive effects leads to the reactivation of PP2A, which in turn triggers apoptosis of CLL cells. When used in combination with SFK inhibitors, the action of MP07-66 is synergistically amplified, providing a new option in the therapeutic strategy for CLL patients.

    Agricultural by-products with bioactive effects: A multivariate approach to evaluate microbial and physicochemical changes in a fresh pork sausage enriched with phenolic compounds from olive vegetation water

    The use of phenolic compounds derived from agricultural by-products could be considered an eco-friendly strategy for food preservation. In this study a purified phenol extract from olive vegetation water (PEOVW) was explored as a potential bioactive ingredient for meat products, using Italian fresh sausage as a food model. The research was developed in two steps: first, an in vitro delineation of the extract's antimicrobial activities was performed; then, the PEOVW was tested in the food model to investigate its possible application in food manufacturing. The in vitro tests showed that PEOVW clearly inhibits the growth of food-borne pathogens such as Listeria monocytogenes and Staphylococcus aureus. Most Gram-positive strains were inhibited at low concentrations (0.375–3 mg/mL). In the production of raw sausages, two concentrations of PEOVW (L1: 0.075% and L2: 0.15%) were used, taking into account both organoleptic traits and the bactericidal effects. A multivariate statistical approach allowed the definition of the microbial and physicochemical changes of sausages during the shelf life (14 days). In general, the inclusion of the L2 concentration reduced the growth of several microbial targets, especially Staphylococcus spp. and LABs (2 log10 CFU/g reduction), while an increase in the growth of yeasts was observed. The reduction of microbial growth could be involved in the reduced lipolysis of raw sausages supplemented with PEOVW, as highlighted by the lower amount of diacylglycerols. Moisture and aw had a significant effect on the variability of microbiological features, while the food matrix (the sausages' environment) can mask the effects of PEOVW on other targets (e.g. Pseudomonas). Moreover, the molecular identification of the main representative taxa collected during the experimentation allowed the evaluation of the effects of phenols on the selection of bacteria. Genetic data suggested a possible strain selection based on storage time and the addition of phenol compounds, especially on LABs and Staphylococcus spp. The modulation effects on lipolysis and the reduction of several microbial targets in a naturally contaminated product indicate that PEOVW may be useful as an ingredient in fresh sausages for improving food safety and quality.

    HS1, a Lyn Kinase Substrate, Is Abnormally Expressed in B-Chronic Lymphocytic Leukemia and Correlates with Response to Fludarabine-Based Regimen

    In B-Chronic Lymphocytic Leukemia (B-CLL), the kinase Lyn is overexpressed, active, abnormally distributed, and part of a cytosolic complex involving hematopoietic lineage cell-specific protein 1 (HS1). These aberrant properties of Lyn could partially explain leukemic cells’ defective apoptosis, directly or through its substrates, for example HS1, which has been associated with apoptosis in different cell types. To verify the hypothesis of HS1 involvement in Lyn-mediated leukemic cell survival, we investigated HS1 protein in 71 untreated B-CLL patients and 26 healthy controls. We found HS1 overexpressed in leukemic as compared to normal B lymphocytes (1.38±0.54 vs 0.86±0.29, p<0.01), and when HS1 levels were correlated to clinical parameters we found a higher expression of HS1 in poor-prognosis patients. Moreover, HS1 levels significantly decreased in ex vivo leukemic cells of patients responding to a fludarabine-containing regimen. We also observed that HS1 is partially localized in the nucleus of neoplastic B cells. All these data add new information on HS1, suggesting a pivotal role of HS1 in Lyn-mediated modulation of leukemic cells’ survival and once again drawing attention to the BCR-Lyn axis as a putative target for new therapeutic strategies in this disorder.

    Computing the Free Energy without Collective Variables

    We introduce an approach for computing the free energy and the probability density in high-dimensional spaces, such as those explored in molecular dynamics simulations of biomolecules. The approach exploits the presence of correlations between the coordinates, induced, in molecular dynamics, by the chemical nature of the molecules. Due to these correlations, the data points lie on a manifold that can be highly curved and twisted, but whose dimension is normally small. We estimate the free energies by finding, with a statistical test, the largest neighborhood in which the free energy in the embedding manifold can be considered constant. Importantly, this procedure does not require defining explicitly the manifold and provides an estimate of the error that is approximately unbiased up to large dimensions. We test this approach on artificial and real data sets, demonstrating that the free energy estimates are reliable for data sets on manifolds of dimension up to ~10, embedded in an arbitrarily large space. In practical applications our method permits the estimation of the free energy in a space of reduced dimensionality without specifying the collective variables defining this space.

    Automatic topography of high-dimensional data sets by non-parametric density peak clustering

    Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach providing this description in the form of a topography of the data, namely a human-readable chart of the probability density from which the data are harvested. The approach is based on an unsupervised extension of Density Peak clustering and on a non-parametric density estimator that measures the probability density in the manifold containing the data. This allows finding automatically the number and the height of the peaks of the probability density, and the depth of the “valleys” separating them. Importantly, the density estimator provides a measure of the error, which allows distinguishing genuine density peaks from density fluctuations due to finite sampling. The approach thus provides robust and visual information about the height of the density peaks, their statistical reliability and their hierarchical organization, offering a conceptually powerful extension of the standard clustering partitions. We show that this framework is particularly useful in the analysis of complex data sets.
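
    As a rough illustration of the Density Peak idea that this abstract builds on, the sketch below scores each point by a local density and by its distance to the nearest point of higher density; points where both are large are candidate peaks. This is only the classic ranking step under stated assumptions, not the unsupervised extension or the error-aware density estimator described above. The k-nearest-neighbor density proxy, the parameter `k`, and the function name `density_peak_scores` are illustrative choices that do not come from the source.

```python
import numpy as np


def density_peak_scores(points, k=10):
    """Return (rho, delta) for each point.

    rho   : crude density proxy, the inverse distance to the k-th neighbor
            (an assumption made for this sketch).
    delta : distance to the nearest point with strictly higher rho; the
            densest point gets its largest distance to any other point.
    """
    x = np.asarray(points, dtype=float)
    n = x.shape[0]

    # Dense pairwise distance matrix (fine for small samples).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=-1))

    # Column 0 of each sorted row is the point itself (distance 0),
    # so column k is the distance to the k-th neighbor.
    kth = np.sort(dist, axis=1)[:, k]
    rho = 1.0 / kth

    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    blobs = np.vstack([
        rng.normal(0.0, 1.0, size=(100, 2)),
        rng.normal(10.0, 1.0, size=(100, 2)),
    ])
    rho, delta = density_peak_scores(blobs, k=10)
    # The two points with the largest rho * delta are the candidate peaks.
    print(np.argsort(rho * delta)[-2:])
```

    On two well-separated blobs, the two highest-scoring points fall in different blobs, which is the behavior the peak-finding step relies on.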